120 research outputs found

    Analyse morphologique non supervisée en domaine biomédical. Application à la recherche d'information

    International audience. In the biomedical domain, using specialized terms is essential for accessing information. However, in many languages, these terms are complex morphological constructions that complicate this access to information. In this article, we focus on identifying the morphological components of these terms and on using them for an information retrieval (IR) task. We propose several approaches that rely on automatic alignment with a particular pivot language, Japanese, and on analogy-based learning, which together make it possible to produce fine-grained morphological analyses of the terms of a given language. These morphological analyses are then used to improve the indexing of biomedical documents. The reported experiments show the validity of this approach, with MAP gains of more than 10% over a standard IR system.

    Hierarchical Image Representation Simplification Driven by Region Complexity

    International audience. This article presents a technique that arranges the elements of hierarchical representations of images according to a coarseness attribute. The choice of the attribute can be made according to prior knowledge about the content of the images and the intended application. The transformation is similar to filtering a hierarchy with a non-increasing attribute, and comprises the results of multiple simple filterings with an increasing attribute. The transformed hierarchy can be used for search space reduction prior to the image analysis process because it allows for direct access to the hierarchy elements at the same scale or within a narrow range of scales.

    PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding

    Large language models (LM) based on Transformers make it possible to generate plausible long texts. In this paper, we explore how this generation can be further controlled at decoding time to satisfy certain constraints (e.g. being non-toxic, conveying certain emotions, using a specific writing style, etc.) without fine-tuning the LM. More precisely, we formalize constrained generation as a tree exploration process guided by a discriminator that indicates how well the associated sequence respects the constraint. This approach, in addition to being easier and cheaper to train than fine-tuning the LM, allows the constraint to be applied more finely and dynamically. We propose several original methods to search this generation tree, notably Monte Carlo Tree Search (MCTS), which provides theoretical guarantees on the search efficiency, but also simpler methods based on re-ranking a pool of diverse sequences using the discriminator scores. These methods are evaluated, with automatic and human-based metrics, on two types of constraints and languages: review polarity and emotion control in French and English. We show that discriminator-guided MCTS decoding achieves state-of-the-art results without having to tune the language model, in both tasks and languages. We also demonstrate that the other proposed decoding methods based on re-ranking can be very effective when diversity among the generated propositions is encouraged. Comment: 15 pages, 5 tables, 7 figures, accepted to NAACL 202
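The simpler re-ranking idea mentioned in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `toy_discriminator`, the `POSITIVE` word list, and the candidate sentences are hypothetical stand-ins, not the discriminator or data used by the authors. The sketch only shows the general shape of scoring a pool of already generated candidates and sorting them by discriminator score; it does not implement MCTS.

```python
from typing import Callable, List

# Hypothetical word list standing in for a trained polarity discriminator.
POSITIVE = {"great", "good"}


def toy_discriminator(text: str) -> float:
    """Return the fraction of words that are in POSITIVE (toy constraint score)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in POSITIVE for w in words) / len(words)


def rerank(candidates: List[str], score: Callable[[str], float]) -> List[str]:
    """Order candidate sequences from highest to lowest discriminator score."""
    return sorted(candidates, key=score, reverse=True)


# Hypothetical pool of candidate generations.
pool = ["the food was awful", "the food was great", "good food great service"]
ranked = rerank(pool, toy_discriminator)
best = ranked[0]
```

In a real setting the pool would come from sampling the language model, and the score would come from a trained classifier; the sorting step itself stays the same.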

    Local 2D Pattern Spectra as Connected Region Descriptors

    International audience. We validate the usage of augmented 2D shape-size pattern spectra, calculated on arbitrary connected regions. The evaluation is performed on MSER regions, and performance competitive with SIFT descriptors is achieved in a simple retrieval system by combining the local pattern spectra with normalized central moments. An additional advantage of the proposed descriptors is their size: being half the size of SIFT, they can handle larger databases in a time-efficient manner. In this paper, we focus on presenting the challenges faced when transitioning from global pattern spectra to local ones. The main contribution is an exhaustive study of the parameters and properties of the newly constructed descriptor. We also consider possible improvements to the quality and computational efficiency of the proposed local descriptors.

    Satellite image retrieval with pattern spectra descriptors

    International audience. The increasing volume of Earth Observation data calls for appropriate solutions in satellite image retrieval. We address this problem by considering morphological descriptors called pattern spectra. Such descriptors are histogram-like structures that contain information on the distribution of predefined properties (attributes) of image components. They can be computed at both the local and the global scale, and are computationally attractive. We demonstrate how they can be embedded in an image retrieval framework and report their promising performance on a standard satellite image dataset.

    Understanding the Security and Robustness of SIFT

    Many content-based retrieval systems (CBIRS) describe images using SIFT local features because they provide very robust recognition capabilities. While SIFT features have proved to cope with a wide spectrum of general-purpose image distortions, their security has not yet been fully assessed. Hsu et al. show that very specific anti-SIFT attacks can jeopardize the keypoint detection. These attacks can delude systems that use SIFT for applications such as image authentication and (pirated) copy detection. Having some expertise in CBIRS, we were extremely concerned by their analysis. This paper presents our own investigations into the impact of these anti-SIFT attacks on a real CBIRS indexing a large collection of images. In fact, the attacks are not able to break the system. A detailed analysis explains this assessment.

    Challenging the Security of CBIR Systems

    Content-Based Image Retrieval Systems (CBIRS) are now commonly used as a filtering mechanism against the piracy of multimedia content. Many publications in the last few years have proposed very robust schemes where pirated content is detected despite severe modifications. But none of these systems has addressed the piracy problem from a security perspective. It is now time to check whether they are secure: can pirates mount violent attacks against CBIRS by carefully studying the technology they use? This paper analyzes the security flaws of the typical technology blocks used in state-of-the-art CBIRS and shows that it is possible to delude such systems, making them useless in practice.

    Fusion-based multimodal detection of hoaxes in social networks

    International audience. Social networks make it possible to share information rapidly and massively. Yet one of their major drawbacks is the absence of verification of the information shared, especially for viral messages. This is the issue addressed by the participants in the Verifying Multimedia Use task of MediaEval 2016. They used several approaches and clues from different modalities (text, image, social information). In this paper, we explore the value of combining and merging these approaches in order to evaluate the predictive power of each modality and to make the most of their potential complementarity.

    Détection de fausses informations dans les réseaux sociaux : l'utilité des fusions de connaissances

    National audience. Social networks enable the massive and rapid dissemination of information. One of the main problems with these communication channels is the lack of verification associated with the virality of the shared information. This is the difficult problem that the participants in the Verifying Multimedia Use task of the MediaEval workshop addressed. To do so, they proposed several strategies and types of clues drawn from different modalities (text, image, social information). In this article, we explore the value of combining and fusing these approaches to evaluate the predictive power of each modality and to take advantage of their potential complementarity.

    Tampering detection and localization in images from social networks : A CBIR approach

    International audience. Step 1: Content-based image retrieval system
    • Goal: find the best image in our database for comparison with the query.
    • Method:
      1. Find the 10 most similar images with a dot product, using the descriptors explained below. The search is accelerated with a KD-tree approach.
      2. Reorder these 10 candidates to find the best one. Define a homography between the query image and each candidate in order to find the best homography.
    Descriptors used, based on VGG19 [1]
    • Two sizes are used: the training image size, or a size based on a kernelization step as in [2].
    • Three vectors are analyzed, based on the output of three layers: the last convolutional layer C5, with an output length of 512, and the fully connected layers C6 and C7, each with an output length of 4096.
    • A mean or max pooling has to be applied to C5 when the standard training image size is used, and to all three outputs in the kernelized approach.
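The first retrieval step described above (pool a feature map into one descriptor, then keep the most similar database entries by dot product) can be sketched as follows. This is a toy illustration under stated assumptions: the function names `pool`, `dot`, and `top_k` are hypothetical, the vectors are tiny hand-made lists rather than VGG19 outputs, and a brute-force scan replaces the KD-tree acceleration mentioned in the abstract. The homography-based reordering of step 2 is not shown.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def pool(feature_map: Sequence[Vector], mode: str = "max") -> List[float]:
    """Collapse per-location channel vectors into a single descriptor.

    feature_map holds one vector per spatial location; each vector has one
    value per channel. Pooling is applied independently to each channel.
    """
    if mode not in ("max", "mean"):
        raise ValueError("mode must be 'max' or 'mean'")
    channels = list(zip(*feature_map))  # regroup the values channel by channel
    if mode == "max":
        return [max(c) for c in channels]
    return [sum(c) / len(c) for c in channels]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product used as the similarity score."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query: Vector, database: Sequence[Vector], k: int = 10) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k highest dot-product scores.

    Brute-force scan (an assumption made for brevity, not a KD-tree).
    """
    scored = [(i, dot(query, d)) for i, d in enumerate(database)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: a feature map with 2 locations and 2 channels, and 3 database descriptors.
feature_map = [[1.0, 4.0], [3.0, 2.0]]
database = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = pool(feature_map, mode="max")
candidates = top_k(query, database, k=2)
```

With real descriptors the database vectors would typically be normalized so that the dot product behaves like a cosine similarity; whether the poster does this is not stated, so it is left out here.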